Abstract

Prior research has established that gender differences in self-perceptions exist. For example, women's post-task 
self-evaluations of performance are lower than men's, especially on masculine gender-typed tasks (Beyer, 1990, 
1991). It was hypothesized that self-consistency tendencies can partially explain gender differences in
self-evaluations. According to self-consistency theory, subjects' expectancies should affect their post-task
self-evaluations. The results confirmed this hypothesis. It was also assessed whether biased recall of one's
performance on individual questions can partially explain gender differences in the accuracy of self-evaluations.
It was found that indeed males were relatively more likely than females to recall those questions which they 
wrongly believed they had answered correctly. 

Introduction 

Women have lower expectancies of success than men (e.g., Beyer, 1987, 1990, 1991, 1992; Daubman, 
Heatherington, & Ahn, 1992; Elliot & Harackiewicz, 1994; Mura, 1987). Carr, Thomas, and Mednick (1985) 
suggested that these low expectancies are indicative of women's tendency to underestimate their abilities. 
Research on causal attributions has also produced evidence for women's putative underestimation of abilities. 
Women tend to attribute success more externally (Meehan & Overton, 1986), or more to effort rather than 
ability, than do men (LaNoue & Curtis, 1985; Parsons, Meece, Adler, & Kaczala, 1982). Some researchers 
suggest that by making external attributions for success, women are not taking credit for their performance, 
thereby showing a "self-derogatory" bias (Erkut, 1983). Females have also been found to have lower self- 
evaluations of performance than men, despite equal performance (Rustemeyer, 1982). 

However, research on expectancies and attributions has not yet investigated whether women's self- 
perceptions are inaccurately low. For example, even though women's expectancies are lower than men's, their 
expectancies may be realistic, whereas men's expectancies may be overly optimistic. Thus, research has 
established that gender differences in expectancies and causal attributions exist but has failed to investigate the 
accuracy of women's and men's self-perceptions. One reason for this state of affairs is that "the study of 
accuracy in self-perception has been impeded by the 'criterion problem': the lack of objective criteria against
which self-perceptions can be compared" (John & Robins, 1994, p. 206).

Beyer (1987, 1990, 1991, 1992) solved the criterion problem by comparing subjects' self-evaluations of 
performance to their actual performance. The difference between self-evaluations and performance indicates the 
level of accuracy of self-perceptions. Beyer found that women's self-evaluations are not only lower than men's 
but are in fact inaccurately low on masculine gender-typed tasks. 

Unlike research on gender, research on depression and self-esteem has recognized the importance of 
assessing the accuracy of self-perceptions. Some theories of depression (Beck, 1976) and self-esteem (Fitch, 
1970) presumed that the self-perceptions of depressives and low self-esteem individuals were negatively 
distorted. However, many researchers have found evidence for depressive realism, also called the "sadder but 
wiser" phenomenon. For example, depressives are more accurate than nondepressives in their evaluations of 
social competence (Lewinsohn, Mischel, Chaplin, & Barton, 1980; McNamara & Hackett, 1986), recall of their 
toddlers' negative behaviors (Lovejoy, 1991), estimates of future success and failure (Alloy & Ahrens, 1987; 
Golin, Terrell, & Johnson, 1977; Golin, Terrell, Weitz, & Drost, 1979), estimates of positive and negative 
events that might happen to them (Crocker, Alloy, & Kayne, 1988), assessments of the degree of control over 
external stimuli (Abramson & Alloy, 1981; Alloy & Abramson, 1979, 1982; Alloy, Abramson, & Kossman, 
1985; Alloy, Abramson, & Viscusi, 1981; Dobson & Franche, 1989; Glass, McKnight, & Valdimarsdottir, 
1993; Martin, Abramson, & Alloy, 1984; Vazquez, 1987), and estimates of the extent of received punishment 
(Nelson & Craighead, 1977). Depressives also show less bias towards recalling flattering self-descriptions than 
nondepressives (Rude, Krantz, & Rosenhan, 1988). In addition, compared to nondepressives, mildly 
depressed individuals are more sensitive to changes in reward contingencies (Rosenfarb, Burker, Morris, &
Cush, 1993). 

However, DeMonbreun and Craighead (1977), Dobson and Shaw (1981), Gotlib (1981, 1983), and Wener 
and Rehm (1975) found no evidence for depressive realism. With the exception of Wener and Rehm (1975) 
these studies used psychiatric patients as subjects rather than dysphoric undergraduates. Thus, a boundary 
condition for depressive realism may be the severity of depression: Only mildly depressed individuals may 
show depressive realism with more depressed individuals showing negative biases. 

Like mild depressives, low self-esteem individuals have more accurate self-evaluations of performance than 
do high self-esteem individuals (Shrauger & Kelly, 1988; Shrauger & Terbovic, 1976). For example, research
on self-assessments of attractiveness indicates that most high self-esteem subjects, and males in general, tend to
overestimate their own attractiveness (Gurman & Balban, 1990). To summarize, research suggests that mildly
depressed and low self-esteem individuals tend to have realistic self-perceptions. Studies on the effect of gender 
on the accuracy of self-perceptions indicate that on masculine tasks women tend to have overly negative self- 
perceptions. The present research will investigate the accuracy of depressives' self-evaluations of performance 
as a further test of the depressive realism hypothesis. In addition, gender differences in the accuracy of self- 
evaluations will be assessed. 

Why might women inaccurately assess, i.e., underestimate, their performance on masculine tasks? There are
reasons to believe that women and men fall prey to different self-perception biases. According to self- 
consistency theory "people interpret and judge their achievements and abilities in ways congruent with prior 
self-conceptions" (Jussim, Coleman, & Nassau, 1987, p. 95). Thus, a person's expectancy biases how 
performance on a task is interpreted. This is especially true when there is some ambiguity regarding the quality 
of performance, as in the absence of feedback (Felson, 1981; Wells & Sweeney, 1986). Therefore,
self-consistency should result in inaccurate self-evaluations on those occasions when expectancies do not coincide
with performance.

Because women have low expectancies for masculine tasks (e.g., Beyer, 1987, 1990, 1991, 1992; Janman,
1987), self-consistency theory predicts that they should evaluate their performance negatively. Conversely, 
men's high expectations should lead to high self-evaluations. Interestingly, there seems to exist a gender 
difference in the strength of self-consistency tendencies. Beyer (1990, 1991, 1992) found that women were 
more prone to self-consistency tendencies than men. A replication of this finding will be attempted. 

Self-consistent tendencies may be revealed in yet another way. People selectively attend to information that 
is consistent with their self-views and show superior recall for such information (Swann & Read, 1981). For 
example, gender-consistent information is more likely to be recognized than gender-inconsistent information
(Markus, Crane, Bernstein, & Siladi, 1982; Stangor, 1988). Patients suffering from panic attacks show a bias
toward threat stimuli compared to controls (Cloitre & Liebowitz, 1991). When processing self-referent material,
depressives recall more negative and self-esteem threatening words or memories than positive words or
memories, while normal controls recall more positive than negative or self-esteem threatening words (Bellew &
Hill, 1990; Bradley & Mathews, 1983; Kuiper, Olinger, MacDonald, & Shaw, 1985; Lloyd & Lishman, 1975;
McDowall, 1984). Thus, in many situations schema-consistent information seems to be attended to and recalled 
better than schema-inconsistent information. 

It is possible that when evaluating their performance on masculine gender-typed tasks, women's recall of 
previously answered questions is biased by preexisting negative self-perceptions. Conceivably, if women have 
negative views or expectations to begin with, they may subsequently remember mostly those questions they 
believe they answered incorrectly, whereas men remember the questions they believe they answered correctly. 
This process could also bias women towards underestimation of their performance. 

This experiment also was designed to determine whether women's inaccurately low self-evaluations would 
only be manifested when making overall self-evaluations or also at a more fundamental level, after assessing 
their performance for each individual question of a test. In other words, if subjects have to constantly monitor 
their performance, does this increase the accuracy of their self-evaluations?

In summary, this experiment tested the following hypotheses: 1. Depressed subjects are more accurate in 
their self-perceptions than nondepressed subjects (depressive realism). 2. Self-consistency tendencies can 
predict gender differences in self-evaluations. Women are hypothesized to show stronger self-consistency
tendencies than men. 3. On the masculine task, women remember more of the questions they answered
incorrectly than do men. 4. Subjects will be more accurate when constantly monitoring their performance than
when only evaluating their overall performance. However, women's lower self-evaluations will be evident 
already when evaluating their performance on individual questions. 

Method 

Subjects. Subjects were 293 female and 174 male students at the University of Wisconsin-Parkside. 

Tasks. Subjects were presented with either a feminine, masculine, or neutral gender-typed task, each 
containing 30 multiple-choice questions. The masculine task contained mathematics questions, the feminine task 
questions on grammar and syntax, and the neutral task questions on history and geography. Based on 
pretesting, the masculine and feminine tasks were constructed so that both genders would answer approximately 
75% of the questions correctly on the gender-congruent task and 60% on the gender-incongruent task. The 
neutral task was constructed so that both genders would answer approximately 75% of the questions correctly. 

Procedure. Subjects were randomly assigned to conditions and tasks. Subjects filled out the Beck 
Depression Inventory (BDI) first and subsequently were given information about the task they were about to 
perform (given sample questions, information about the number of questions, length of time available, etc.). 
In the nonmonitoring condition subjects stated performance expectancies, performed the task, then estimated
the number of correctly answered questions (self-evaluation) without receiving feedback regarding their 
performance. They then had to recall as many of the questions that had appeared on the task as possible and 
indicate for each recalled question, whether they believed they had answered that question correctly or 
incorrectly. 

There was only one addition to this procedure in the monitoring condition. Immediately after answering 
each of the 30 multiple-choice questions, subjects stated how confident they were of having answered that 
particular question correctly. Confidence ratings could range from 0% to 100% sure. Only after completing 
these 30 ratings of question confidence did these subjects evaluate their overall performance. Thus, in the 
monitoring condition subjects were forced to constantly pay attention to their performance. 

Results 

Results were analyzed with 3 (gender-typedness of task: feminine, masculine, or neutral) x 2 (monitoring 
condition: monitoring vs. nonmonitoring) x 2 (gender) analyses of variance (ANOVAs). 

Depressive realism. Accuracy scores were regressed on depression scores, gender, gender-typedness of
task, monitoring condition, and all the interaction terms in multiple regression analyses. Both gender and
monitoring condition interacted with depression scores, F(1, 458) = 4.32, p < .04, F(1, 458) = 3.79, p < .06,
respectively. Therefore separate regression analyses for males and females and monitoring conditions were
performed.

Subjects' depression scores significantly predicted accuracy in the monitoring condition of the feminine task, 
F(1, 74) = 5.93, p < .02, and marginally for the neutral nonmonitoring condition, F(1, 58) = 3.60, p < .07.
The higher a subject's depression score, the more s/he underestimated performance. Subjects' depression scores
significantly predicted their accuracy scores in the masculine monitoring condition, F(1, 88) = 5.80, p < .02.
The higher a subject's depression score, the more accurately the subject evaluated her/his performance. In the
feminine and masculine nonmonitoring conditions and the neutral monitoring condition depression scores did
not significantly predict accuracy, F(1, 83) < 1, F(1, 83) = 1.81, p < .19, F(1, 66) < 1, respectively. Thus,
with the exception of the masculine monitoring task, depression was not related to accurate self-evaluations, 
which represents a problem for the depressive realism hypothesis. 

Accuracy of self-evaluations. Accuracy of self-evaluations was assessed by subtracting performance from 
self-evaluation scores. Positive discrepancies indicate overestimations, negative numbers underestimations, and 
scores around zero indicate accuracy in self-evaluations. 3 x 2 x 2 ANOVAs were conducted on accuracy
scores. Because the effect of monitoring condition was not significant, F(1, 460) < 1, it was omitted from
further analyses of accuracy scores. To determine whether an accuracy score is significantly different from
zero, and whether there is a significant gender difference in accuracy for each task, a repeated measures ANOVA
was computed with performance and self-evaluations as within-subjects factors and gender as a between-subjects
factor.
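
The accuracy score defined above can be illustrated with a minimal sketch. The function names and the sample numbers are hypothetical and are not taken from the study's data or materials; they only show how the sign of the discrepancy is read.

```python
def accuracy_score(self_evaluation: int, performance: int) -> int:
    """Self-evaluation minus performance, both counted in questions out of 30.

    Positive values are overestimations, negative values underestimations,
    and zero is a perfectly accurate self-evaluation.
    """
    return self_evaluation - performance


def describe(score: int) -> str:
    """Label a discrepancy score by its direction (hypothetical helper)."""
    if score > 0:
        return "overestimation"
    if score < 0:
        return "underestimation"
    return "accurate"


# Hypothetical subject: estimated 20 correct answers, actually had 23 correct.
score = accuracy_score(self_evaluation=20, performance=23)
print(score, describe(score))
```

Whether a group's mean score differs reliably from zero is a separate inferential question, which the text addresses with the repeated measures ANOVA rather than with the raw discrepancy.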

Women and men did not differ significantly in the accuracy of self-evaluations on the feminine task, F(1,
159) < 1. Both genders significantly underestimated their performance, F(1, 159) = 26.53, p < .0001. The
gender difference in accuracy on the neutral task was marginally significant, F(1, 126) = 3.35, p < .07.
Whereas men's self-evaluations were accurate, F(1, 49) = 2.47, p < .13, women significantly underestimated
their performance, F(1, 77) = 15.13, p < .0001. Both males and females significantly overestimated their
performance on the masculine task, F(1, 67) = 45.36, p < .0001, F(1, 106) = 5.94, p < .02, respectively.
However, as predicted, the gender difference in accuracy of self-evaluations was significant on this task, F(1,
173) = 5.00, p < .03.

Self-consistency hypothesis. Self-evaluation was regressed on expectancy, performance, gender,
gender-typedness of task, monitoring condition, and all the interaction terms in multiple regression analyses. Because
the effect of monitoring condition was not significant, F(1, 446) < 1, it was omitted from further analyses.
Gender interacted with expectancies to a significant degree, F(1, 453) = 8.40, p < .004, making separate
regression analyses for males and females advisable. 

Performance and expectancies accounted for significant amounts of variance in self-evaluations on the 
feminine task for women, F(1, 102) = 33.67, p < .0001, F(1, 102) = 19.04, p < .0001, respectively. These
effects were present for men also, albeit to a considerably smaller extent: Performance significantly predicted
self-evaluations, F(1, 52) = 4.19, p < .05, but expectancies were only marginally significant predictors of
self-evaluations, F(1, 52) = 3.77, p < .06.

Performance and expectancies accounted for significant amounts of variance in self-evaluations on the 
neutral task for women, F(1, 74) = 5.60, p < .03, F(1, 74) = 5.96, p < .02, respectively. These effects were
present for men also: Performance and expectancies significantly predicted self-evaluations, F(1, 46) = 7.30,
p < .01, F(1, 46) = 5.27, p < .03, respectively.

Performance and expectancies accounted for significant amounts of variance in self-evaluations on the 
masculine task for women, F(1, 103) = 57.12, p < .0001, F(1, 103) = 48.57, p < .0001, respectively. As
hypothesized, for men performance significantly predicted self-evaluations, F(1, 64) = 86.23, p < .0001,
whereas expectancy was not a significant predictor of self-evaluations, F(1, 64) = 1.61, p < .21. Thus, as
hypothesized, women relied more on expectancies when evaluating their performance, i.e., were relatively more 
influenced by self-consistency tendencies, than men, especially on the masculine task. 

Gender differences in recall. It was hypothesized that on the masculine task women would be more likely to
recall questions they had answered incorrectly. This was tested by means of a 3 x 2 x 2 analysis of covariance
with subjects' performance on a task as covariate and the number of recalled questions which had been
answered incorrectly divided by total recall as the dependent variable. This analysis takes differences in total
recall and actual performance into account. As recommended by Winer, Brown, and Michels (1991), arcsine 
transformations were performed on these data. 

Because the effect of monitoring condition was not significant, F(1, 436) < 1, it was omitted from further
analyses. On neither the feminine nor the neutral task was the gender difference in the proportion of recalled
questions which had been answered incorrectly significant, F(1, 143) < 1, F(1, 125) < 1, respectively. On the
masculine task the gender difference was borderline significant, F(1, 163) = 3.21, p < .08. Thus, even with
performance covaried out, women were somewhat more likely than men to recall questions which they had 
answered incorrectly. 

A further analysis was conducted to analyze gender differences in recall in even more detail. Those recall 
scores representing errors in judgment were analyzed. Two such errors are possible: 1. thinking you answered 
a question incorrectly when in fact you answered it correctly (number of recalled questions that had been 
answered correctly and which the subject thought s/he had answered incorrectly divided by that subject's total 
recall of questions which had in fact been answered correctly) and 2. thinking you answered a question 
correctly when in fact you answered it incorrectly (number of recalled questions that had been answered 
incorrectly and which the subject thought s/he had answered correctly divided by that subject's total recall of 
questions which had in fact been answered incorrectly). Scores were corrected for differences in recall. Arcsine 
transformations were performed on these data. These data were analyzed with a 3 x 2 x 2 ANOVA. 
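
The two error proportions described above can be sketched as follows. This is an illustration only: the record type, the function names, and the sample data are hypothetical, and the exact form of the arcsine transformation is an assumption, since the text names the transformation but does not give its formula. The sketch uses the common variance-stabilizing form 2 * arcsin(sqrt(p)).

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RecalledQuestion:
    answered_correctly: bool  # what actually happened on the task
    believed_correct: bool    # what the subject reported at recall


def recall_error_proportions(
    recalled: List[RecalledQuestion],
) -> Tuple[Optional[float], Optional[float]]:
    """Return (error 1, error 2) as proportions of the relevant recalled items.

    Error 1: answered correctly but believed incorrect, divided by the number
             of recalled questions that were in fact answered correctly.
    Error 2: answered incorrectly but believed correct, divided by the number
             of recalled questions that were in fact answered incorrectly.
    A proportion is None when its denominator is zero.
    """
    correct = [q for q in recalled if q.answered_correctly]
    incorrect = [q for q in recalled if not q.answered_correctly]
    error1 = (sum(1 for q in correct if not q.believed_correct) / len(correct)
              if correct else None)
    error2 = (sum(1 for q in incorrect if q.believed_correct) / len(incorrect)
              if incorrect else None)
    return error1, error2


def arcsine_transform(p: float) -> float:
    """Assumed form of the transformation: 2 * arcsin(sqrt(p)), 0 <= p <= 1."""
    return 2.0 * math.asin(math.sqrt(p))


# Hypothetical subject who recalled six questions.
recalled = [
    RecalledQuestion(True, True),
    RecalledQuestion(True, False),   # error 1
    RecalledQuestion(True, True),
    RecalledQuestion(True, True),
    RecalledQuestion(False, True),   # error 2
    RecalledQuestion(False, False),
]
error1, error2 = recall_error_proportions(recalled)
print(error1, error2)  # 0.25 0.5
```

Dividing by the number of recalled questions of the relevant kind, rather than by all 30 questions, is what corrects each score for differences in how much a subject recalled.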

Because of the presence of several interactions among the variables for the first recall variable (answered 
correctly but subject thought question was answered incorrectly divided by subject's total recall of questions 
which had been answered correctly), the data were analyzed separately by task and monitoring condition. In the
feminine monitoring condition, the gender difference in recall was not significant, F(1, 64) < 1. In the
feminine, neutral, and masculine nonmonitoring conditions the gender difference was not significant, F(1, 80) =
2.48, p < .12, F(1, 58) = 2.13, p < .15, F(1, 78) < 1, respectively. In the masculine and neutral monitoring
conditions, some evidence for differential recall was found. The gender difference in recall was borderline
significant for the neutral task, F(1, 66) = 2.89, p < .10, and significant for the masculine task, F(1, 85) =
4.15, p < .05. As predicted, women more often than men judged correctly answered questions as having been
answered incorrectly. 

Because monitoring condition was not a significant factor in the analyses of the second recall variable 
(questions which had been answered incorrectly, but for which the subject thought the question was answered 
correctly), it was dropped from further analyses of this dependent variable. The gender difference in recall was 
not significant for the feminine task, F(1, 146) < 1, the neutral task, F(1, 126) = 1.70, p < .20, or the
masculine task, F(1, 165) = 1.13, p < .29.

Thus, on the masculine task information on failure was more mentally available to women than men. 
Interestingly, on the masculine and neutral tasks information on misperceived (as opposed to actual) failure was 
more mentally available for women than for men. This differential recall plus women's reliance on self- 
consistency may explain women's lower estimations of performance. 

Gender differences in False Alarms and Misses. Subjects' confidence statements for each individual
question in the monitoring condition were analyzed in terms of the proportion of false alarms and misses. A
false alarm is an incorrectly answered question for which the subject was highly confident that it was answered
correctly. A miss is a correctly answered question for which the subject showed little confidence. Misses were 
transformed into proportions by dividing each by a subject's performance score. False alarms were transformed 
into proportions by dividing each by a subject's number of incorrectly answered questions. An arcsine 
transformation was performed on these proportion data. A high proportion of false alarms indicates overly high 
confidence. A high proportion of misses indicates overly low confidence. 
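
A minimal sketch of the false alarm and miss proportions follows. The text does not report the confidence cutoffs that defined "highly confident" and "little confidence", so the two thresholds below (80% and 20%) are purely hypothetical placeholders, as are the function name and the sample ratings.

```python
from typing import List, Optional, Tuple

# Hypothetical cutoffs; the source does not state the values that were used.
HIGH_CONFIDENCE = 80.0
LOW_CONFIDENCE = 20.0


def false_alarm_and_miss_proportions(
    answers: List[Tuple[bool, float]],
    high: float = HIGH_CONFIDENCE,
    low: float = LOW_CONFIDENCE,
) -> Tuple[Optional[float], Optional[float]]:
    """Return (false alarm proportion, miss proportion) for one subject.

    answers holds one (answered_correctly, confidence in percent) pair per
    question. False alarms are divided by the number of incorrectly answered
    questions; misses are divided by the number of correctly answered
    questions (the performance score). A proportion is None when its
    denominator is zero.
    """
    n_correct = sum(1 for correct, _ in answers if correct)
    n_incorrect = len(answers) - n_correct
    false_alarms = sum(1 for correct, conf in answers
                       if not correct and conf >= high)
    misses = sum(1 for correct, conf in answers
                 if correct and conf <= low)
    fa_prop = false_alarms / n_incorrect if n_incorrect else None
    miss_prop = misses / n_correct if n_correct else None
    return fa_prop, miss_prop


# Hypothetical ratings for six questions.
ratings = [(True, 90.0), (True, 10.0), (False, 95.0),
           (False, 30.0), (True, 15.0), (True, 60.0)]
print(false_alarm_and_miss_proportions(ratings))  # (0.5, 0.5)
```

Because each count is divided by the subject's own number of incorrect or correct answers, the resulting proportions are not confounded with overall performance on the task.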

On the feminine task there was no gender difference in the proportions of false alarms and misses, Fs(1,
75) < 1. On the neutral and masculine tasks, men had significantly more false alarms than women, i.e., more
frequently had high confidence when they answered a question incorrectly, F(1, 66) = 9.32, p < .003, F(1, 87)
= 4.08, p < .05, respectively. Women had significantly more misses on the masculine task than men, i.e., more
frequently had low confidence when they answered a question correctly, F(1, 89) = 10.75, p < .001.



Discussion 

This study found little evidence to support the depressive realism hypothesis. In only 3 out of 6 conditions 
was depression significantly related to accuracy of self-perception. However, contrary to the depressive realism 
hypothesis, in 2 out of those 3 conditions depression was related to less accurate self-evaluations. 

The hypothesis that gender differences in accuracy depend on the gender-typedness of the task received 
support. The only task for which the gender difference in accuracy reached significance was the masculine task. 
The results for the neutral task were of borderline significance. However, a surprising quirk in the data was 
found. It was hypothesized that women underestimate their performance on the masculine task. In fact they 
overestimated their performance albeit to a significantly smaller extent than men overestimated their 
performance. This result represents a departure from the results of four previous experiments in which women 
underestimated their performance on the masculine task (Beyer, 1987, 1990, 1991, 1992). With 20/20 
hindsight this result can be explained by the nature of the task that was employed in this experiment. Whereas 
the previous four experiments employed sports and politics questions, the present experiment used math 
questions for the masculine task. The problem with math questions is that in an indirect way feedback on the 
accuracy of one's solution to a problem is provided: after working through a problem, the answer either 
matches or does not match one of the provided response alternatives. When there is a match, the subject is 
likely to assume that s/he answered the question correctly. However, by design, several of the response 
alternatives were meant to mislead subjects by matching incorrect solutions. Thus, a subject might produce a 
match with an incorrect response alternative and assume that this match meant that the question was answered 
correctly. This should lead to an inaccurately high self-evaluation. No such indirect feedback on performance is 
provided by sports and politics questions nor by the feminine (English language) and neutral (history and
geography) questions of this experiment. Thus, math questions represent a less than ideal kind of masculine 
items. However, one should not lose sight of the fact that consistent with prior experiments, the only significant 
gender difference in the accuracy of self-evaluations was found on the masculine task. 

Expectancies had a significant effect on self-evaluations, demonstrating the existence of self-consistency 
tendencies. As predicted, performance was the best predictor of self-evaluations on all tasks. For women 
unlike for men, expectancies played almost as important a role as did performance in predicting self-evaluations. 
On the masculine task, men's self-evaluations were predicted only by performance, not by expectancies. This 
indicates that men's self-evaluations are more guided by their performance and women's self-evaluations are 
guided to a considerable extent by self-consistency. Because women's initial expectancies tend to be lower than 
men's, especially on masculine tasks, this reliance on self-consistency when evaluating performance results in 
lower self-evaluations. 

As predicted, gender differences in recall of incorrectly answered questions were found on the masculine 
task. Thus, on the masculine task more information on failure was mentally available to women than men. 
What is even more interesting, however, is that on the masculine and neutral tasks information on misperceived
(as opposed to actual) failure was more mentally available for women than for men. Thus, women's recall 
compared to men's was biased in a negative direction. It is important to note that this gender difference in 
biased recall was not evident on the feminine task. This biased recall plus women's reliance on self-consistency 
may explain women's lower estimations of performance. 

Contrary to the hypothesis, when subjects had to state their confidence for each individual question, they 
were no more accurate in their self-evaluations than subjects who were not forced to constantly monitor their 
performance. This points out that simply paying more attention to one's performance does not enhance 
accuracy. 

As hypothesized, when examining the data regarding the presence of biased self-evaluations at the level of
individual questions, data consistent with the above results were found. On the feminine task, for which no
gender difference in the accuracy of overall self-evaluations was found, no gender difference in the accuracy of
self-evaluations at the level of individual questions was found. However, on the neutral and masculine tasks for
which gender differences in overall self-evaluations were present, gender differences in self-evaluations at the 
level of individual questions were found: Men more frequently had high confidence when they answered a 
question incorrectly than did women, whereas women more frequently had low confidence when they answered 
a question correctly than did men. Thus, on the masculine and neutral tasks, biased self-perceptions were 
already operating when evaluating performance on each individual question. This seems to indicate that 
women's self-perceptions are not only biased when mentally averaging performance on individual questions into
one overall self-evaluation, but are biased already at the more fundamental level of evaluating performance on 
individual questions of a task. This statement needs qualification, however. This occurs only for masculine and 
neutral tasks. 

Why is it important to demonstrate that women's self-evaluations are frequently inaccurately low? Positive 
self-perceptions, even if they are inaccurately high, are related to psychological health (Snyder, 1989; Taylor, 
Collins, Skokan, & Aspinwall, 1989), improved motivation, and task persistence (Abramson & Alloy, 1981). 
Low perceptions of performance negatively affect performance (Elliott & Dweck, 1988), persistence (Elliott & 
Dweck, 1988), expectancies for future performance (Phillips, 1987), aspirations (Phillips, 1984), and affect 
(Elliott & Dweck, 1988). Negative self-evaluations combined with stressors can lead to depression (Brown, 
Andrews, Bifulco, & Veiel, 1990; Brown, Andrews, Harris, Adler, & Bridge, 1986; Brown, Bifulco, & 
Andrews, 1990). 

Thus, women's inaccurately low self-evaluations may have damaging consequences. For example, females 
who received high grades in math courses, but nevertheless had low expectancies for future grades, did not 
enroll in advanced math courses (Lantz & Smith, 1981). For males, it was only poor performance which led to 
an avoidance of math courses. Thus, men's future math taking behavior could be predicted by grades 
(performance), whereas women's math taking behavior could be predicted by low expectancies (self- 
consistency). This study of naturalistic behavior nicely supports the findings of this experiment regarding 
differential emphasis on self-consistency and performance by men and women. 

But why do females who received superior grades in math develop low expectancies for future math grades? 
The present experiment suggests that one reason may be females' reliance on self-consistency. If females have
low expectancies for math performance to begin with, they are likely to inaccurately assess their performance in 
math. If inaccurately low self-evaluations affect future expectancies negatively, females are unlikely to take 
more math in the future. This may partially account for the underrepresentation of women in math (Eccles, 
1987). Still, some vexing questions remain. Why would females have low expectancies for math to begin with 
and why does objective feedback such as grades not alter females' expectancies for future math courses? 

Females are socialized to be modest, whereas males are taught to be confident regarding academic 
achievements (Phillips, 1987). Many parents have inaccurately low perceptions of their daughters' ability in
such areas as math. These low perceptions eventually come to be shared by their daughters (Parsons, Adler, & 
Kaczala, 1982). Thus, females learn from parents and society to underestimate their competence. As this 
research has demonstrated, females learn their lessons well, i.e., women indeed tend to have lower expectancies 
than men. Unfortunately, because of females' reliance on self-consistency, once they have learned their lesson 
(to have low expectancies), they have difficulty unlearning it. 

It is not too difficult to believe that when feedback about actual performance is absent, such as in the present 
research, biases such as self-consistency could come into play. But what about those cases where clear, 
unambiguous feedback regarding performance is available, such as in the above-mentioned study by Lantz and
Smith (1981)? Why do so many females who receive feedback regarding performance in the form of high
grades in math believe that they will do poorly in the future? The recall data may provide some insight here. On 
the masculine task, women were more likely than men to recall questions they believed they answered 
incorrectly. Such biased recall of negative information is likely to affect self-evaluations. If a relatively high 
proportion of information on believed failure is mentally available when evaluating one's performance, this 
should negatively bias self-evaluations. Many of us have known individuals who, after receiving feedback on 
their performance, focus on and remember the tiny bit of criticism rather than the overwhelming amount of 
praise. Perhaps females who receive high grades in math focus on the negative aspects of their performance 
(mistakes) rather than the positive aspects (high grades), therefore perceive their performance as failure and 
avoid math in the future. 

Because of the serious implications of underestimations of performance for self-confidence and 
psychological health, more attention should be devoted to the investigation of gender differences in the accuracy 
of self-evaluations. Such research will not only elucidate the underlying processes of self-evaluation biases and 
therefore be of theoretical interest, but will also be of practical value by suggesting ways of reducing women's 
underestimations of performance. 